Analyses with words as outcome

Set-up

Packages

library(brms)
library(ggplot2)
library(kableExtra)
library(lme4)
library(lmerTest)
library(rmarkdown)
library(performance)
library(see)
library(sjmisc)
library(tidyverse)

options(
  digits = 3
)
set.seed(170819)

Custom functions

# function to silence brms output
hush <- 
  function(
    code
    ){
    sink("/dev/null")
    tmp = code
    sink()
    return(tmp)
    }

Data

d_words <- read_csv("data/DataAggregated_T1T2_nwords.csv")
d <- read_csv("data/data.csv")

# get words from dataframe
d$n_Words <- d_words$n_Words

# same as above; but original file name:
# d <- read_csv("data/DataAggregated_T1T2_costsbenefits.csv")

# load image for work in IDE
# load("data/image_words.RData")

d <- d |> 
  rename(
    group = roles,
    gender = DE01_T1,
    age = DE02_01_T1,
    pol_stance = DE06_01_T1
  ) |> 
  mutate(
    female = as.logical(2 - gender),
    gender = factor(gender, labels = c("female", "male")),
    n_Words = replace_na(n_Words, 0)
  )

# recode to make as sum coding
d$anonymity_dev <- factor(d$anonymity)
contrasts(d$anonymity_dev) <- contr.sum(2)
d$persistence_dev <- factor(d$persistence)
contrasts(d$persistence_dev) <- contr.sum(2)

Descriptives

Let’s inspect distribution of words communicated.

ggplot(d, aes(n_Words)) +
  geom_histogram(binwidth = 50)

Looks like a zero-inflated poisson distribution. Confirms our preregistered approach to analyze data using zero-inflated Poisson approach.

nrow(d[d$n_Words == 0, ]) / nrow(d)
[1] 0.206

Overall, 21% of participants without any written words.

We first look at the experimental group’s descriptives

d |> 
  group_by(persistence) |> 
  summarize(n_Words_m = mean(n_Words)) |> 
  as.data.frame() |> 
  kable()
persistence n_Words_m
high 365
low 422

Looking at persistence, we see there’s a slight difference among groups. Participants in the low persistence condition communicated more.

d |> 
  group_by(anonymity) |> 
  summarize(n_Words_m = mean(n_Words)) |> 
  as.data.frame() |> 
  kable()
anonymity n_Words_m
high 376
low 411

People who with less anonymity communicated more. But the difference isn’t particularly large.

d |> 
  group_by(persistence, anonymity) |> 
  summarize(n_Words_m = mean(n_Words)) |> 
  as.data.frame() |> 
  kable()
`summarise()` has grouped output by 'persistence'. You can override using the
`.groups` argument.
persistence anonymity n_Words_m
high high 356
high low 375
low high 397
low low 448

Looking at both groups combined, we see that low anonymity and low persistence created highest participation.

d |> 
  group_by(group) |> 
  summarize(
    anonymity = anonymity[1],
    persistence = persistence[1],
    topic = topic[1],
    n_Words_m = mean(n_Words)
    ) |> 
  rmarkdown::paged_table()

Looking at the various individual groups, we do see some difference. Generally, this shows that communication varied across groups.

d |> 
  group_by(topic) |> 
  summarize(n_Words_m = mean(n_Words)) |> 
  as.data.frame() |> 
  kable()
topic n_Words_m
climate 388
gender 368
migration 426

Looking at topics specifically, we also see that there’s some variance.

Bayesian mixed effects modeling

We analyze the data using Bayesian modelling.

We use deviation/sum contrast coding (-.1, .1). Meaning, contrasts measure main effects of independent variables.

Fixed effects

We preregistered to analyze fixed effects.

fit_fe_1 <- 
  hush(
    brm(
      n_Words ~ 
        1 + persistence_dev * anonymity_dev + age + female + pol_stance +
        (1 | topic/group)
      , data = d
      , chains = 4
      , cores = 4
      , iter = 6000
      , warmup = 2000
      , family = zero_inflated_poisson()
      , control = list(
        adapt_delta = .95
        , max_treedepth = 12
        )
      , save_pars = save_pars(all = TRUE)
      , silent = 2
      )
  )
Warning: There were 559 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: There were 237 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 12. See
https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
Warning: Examine the pairs() plot to diagnose sampling problems

Shows some convergence warnings. Let’s inspect model.

plot(fit_fe_1, ask = FALSE)

Trace-plots look alright.

Let’s look at results.

summary(fit_fe_1)
Warning: There were 559 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
 Family: zero_inflated_poisson 
  Links: mu = log; zi = identity 
Formula: n_Words ~ 1 + persistence_dev * anonymity_dev + age + female + pol_stance + (1 | topic/group) 
   Data: d (Number of observations: 960) 
  Draws: 4 chains, each with iter = 6000; warmup = 2000; thin = 1;
         total post-warmup draws = 16000

Multilevel Hyperparameters:
~topic (Number of levels: 3) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.15      0.17     0.00     0.63 1.00     2715     4759

~topic:group (Number of levels: 48) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.37      0.04     0.30     0.45 1.00     2772     4125

Regression Coefficients:
                                Estimate Est.Error l-95% CI u-95% CI Rhat
Intercept                           5.78      0.11     5.55     6.02 1.00
persistence_dev1                   -0.04      0.05    -0.14     0.07 1.00
anonymity_dev1                     -0.05      0.05    -0.16     0.05 1.00
age                                 0.01      0.00     0.01     0.01 1.00
femaleTRUE                         -0.13      0.00    -0.13    -0.12 1.00
pol_stance                         -0.02      0.00    -0.02    -0.02 1.00
persistence_dev1:anonymity_dev1     0.05      0.05    -0.06     0.15 1.00
                                Bulk_ESS Tail_ESS
Intercept                           4283     4646
persistence_dev1                    3303     4336
anonymity_dev1                      3106     4651
age                                16884    12262
femaleTRUE                         10329     7781
pol_stance                         12176     8724
persistence_dev1:anonymity_dev1     3269     3896

Further Distributional Parameters:
   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
zi     0.21      0.01     0.18     0.23 1.00     8989     7641

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

No significant effect emerged.

Let’s inspect ICC

var_ratio_fe <- performance::variance_decomposition(
  fit_fe_1
  , by_group = TRUE)
var_ratio_fe
# Random Effect Variances and ICC

Conditioned on: all random effects

## Variance Ratio (comparable to ICC)
Ratio: 0.45  CI 95%: [0.12 0.66]

## Variances of Posterior Predicted Distribution
Conditioned on fixed effects: 43442.51  CI 95%: [27273.26 70002.16]
Conditioned on rand. effects: 79718.68  CI 95%: [74074.94 85167.06]

## Difference in Variances
Difference: 36192.27  CI 95%: [9511.86 52872.17]

45.485 percent of variance in words communicated explained by both topics and groups.

Let’s visualize results to see what they exactly mean.

p <- plot(
  conditional_effects(
    fit_fe_1
    ), 
  ask = FALSE,
  plot = FALSE
  )

p_anon <- 
  p[["anonymity_dev"]] +
  xlab("Anonymity") +
  ylab("Words") +
  scale_x_discrete(
    limits = rev
  )
  #    ) +
  # scale_y_continuous(
  #   limits = c(5, 14)
  #   , breaks = c(6, 8, 10, 12, 14)
  #   )

p_pers <- 
  p[["persistence_dev"]] +
  xlab("Persistence") +
  ylab("Words") +
  scale_x_discrete(
    limits = rev
   ) +
  # scale_y_continuous(
  #   limits = c(5, 14)
  #   , breaks = c(6, 8, 10, 12, 14)
  #   ) +
  theme(
    axis.title.y = element_blank()
    )

p_int <- 
  p[["persistence_dev:anonymity_dev"]] +
  xlab("Persistence") +
  scale_x_discrete(
    limits = rev
     ) +
  scale_color_discrete(
    labels = c("low", "high")
    ) +
  guides(
    fill = "none",
    color = guide_legend(
      title = "Anonymity"
      )
    ) +
  # scale_y_continuous(
  #   limits = c(5, 14)
  #   , breaks = c(6, 8, 10, 12, 14)
  #   ) +
  theme(
    axis.title.y = element_blank()
    )

plot <- cowplot::plot_grid(
  p_anon, p_pers, p_int, 
  labels = c('A', 'B', "C"), 
  nrow = 1,
  rel_widths = c(2, 2, 3)
  )

plot

ggsave("figures/results.png", plot, width = 8, height = 4)

Shows that there are no main effects. There seems to be a (nonsignificant) interaction effect. In low persistence environment, anonymity is conducive to communication; in high it’s the opposite.

Let’s look at posteriors

p_1 <- 
  pp_check(fit_fe_1) + 
  labs(title = "Zero-inflated poisson")
Using 10 posterior draws for ppc type 'dens_overlay' by default.
p_1

The actual distribution cannot be precisely reproduced, but it’s also not too far off.

Random effects

We preregistered to explore and compare models with random effects. So let’s model how the experimental conditions affect the outcomes differently depending on topic.

fit_re_1 <- 
  hush(
    brm(
      n_Words ~ 
        1 + persistence_dev * anonymity_dev + age + female + pol_stance +
        (1 + persistence_dev * anonymity_dev | topic) + 
        (1 | topic:group)
      , data = d
      , chains = 4
      , cores = 4
      , iter = 6000
      , warmup = 2000
      , family = zero_inflated_poisson()
      , control = list(
        adapt_delta = .95
        , max_treedepth = 15
        )
      , save_pars = save_pars(all = TRUE)
    )
  )
Warning: There were 525 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: Examine the pairs() plot to diagnose sampling problems

Shows some convergence warnings.

Let’s inspect model.

plot(fit_re_1, ask = FALSE)

Traceplots look alright.

Let’s look at results.

summary(fit_re_1)
Warning: There were 525 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
 Family: zero_inflated_poisson 
  Links: mu = log; zi = identity 
Formula: n_Words ~ 1 + persistence_dev * anonymity_dev + age + female + pol_stance + (1 + persistence_dev * anonymity_dev | topic) + (1 | topic:group) 
   Data: d (Number of observations: 960) 
  Draws: 4 chains, each with iter = 6000; warmup = 2000; thin = 1;
         total post-warmup draws = 16000

Multilevel Hyperparameters:
~topic (Number of levels: 3) 
                                                      Estimate Est.Error
sd(Intercept)                                             0.26      0.41
sd(persistence_dev1)                                      0.30      0.45
sd(anonymity_dev1)                                        0.26      0.40
sd(persistence_dev1:anonymity_dev1)                       0.38      0.48
cor(Intercept,persistence_dev1)                          -0.01      0.47
cor(Intercept,anonymity_dev1)                            -0.00      0.47
cor(persistence_dev1,anonymity_dev1)                     -0.01      0.47
cor(Intercept,persistence_dev1:anonymity_dev1)           -0.04      0.46
cor(persistence_dev1,persistence_dev1:anonymity_dev1)     0.03      0.47
cor(anonymity_dev1,persistence_dev1:anonymity_dev1)      -0.01      0.46
                                                      l-95% CI u-95% CI Rhat
sd(Intercept)                                             0.00     1.43 1.00
sd(persistence_dev1)                                      0.01     1.64 1.00
sd(anonymity_dev1)                                        0.00     1.42 1.00
sd(persistence_dev1:anonymity_dev1)                       0.01     1.75 1.00
cor(Intercept,persistence_dev1)                          -0.85     0.83 1.00
cor(Intercept,anonymity_dev1)                            -0.83     0.84 1.00
cor(persistence_dev1,anonymity_dev1)                     -0.84     0.83 1.00
cor(Intercept,persistence_dev1:anonymity_dev1)           -0.85     0.81 1.00
cor(persistence_dev1,persistence_dev1:anonymity_dev1)    -0.82     0.85 1.00
cor(anonymity_dev1,persistence_dev1:anonymity_dev1)      -0.83     0.83 1.00
                                                      Bulk_ESS Tail_ESS
sd(Intercept)                                             6069     6047
sd(persistence_dev1)                                      4910     6104
sd(anonymity_dev1)                                        5291     5949
sd(persistence_dev1:anonymity_dev1)                       3597     4722
cor(Intercept,persistence_dev1)                          20373    11843
cor(Intercept,anonymity_dev1)                            22746    11797
cor(persistence_dev1,anonymity_dev1)                     17202    12534
cor(Intercept,persistence_dev1:anonymity_dev1)           19402    12987
cor(persistence_dev1,persistence_dev1:anonymity_dev1)    16036    13418
cor(anonymity_dev1,persistence_dev1:anonymity_dev1)      12041    13726

~topic:group (Number of levels: 48) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.37      0.04     0.29     0.46 1.00     6304     9612

Regression Coefficients:
                                Estimate Est.Error l-95% CI u-95% CI Rhat
Intercept                           5.77      0.25     5.24     6.21 1.00
persistence_dev1                   -0.04      0.27    -0.58     0.51 1.00
anonymity_dev1                     -0.06      0.24    -0.59     0.37 1.00
age                                 0.01      0.00     0.01     0.01 1.00
femaleTRUE                         -0.12      0.00    -0.13    -0.12 1.00
pol_stance                         -0.02      0.00    -0.02    -0.02 1.00
persistence_dev1:anonymity_dev1     0.05      0.33    -0.65     0.75 1.00
                                Bulk_ESS Tail_ESS
Intercept                           6778     4541
persistence_dev1                    7308     5714
anonymity_dev1                      7787     5343
age                                15915    15238
femaleTRUE                         22631    10396
pol_stance                         16822     9683
persistence_dev1:anonymity_dev1     6387     4994

Further Distributional Parameters:
   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
zi     0.21      0.01     0.18     0.23 1.00    31492    11056

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Again, no main or interaction effects.

Let’s see if the random effects model fits better

fit_fe_1 <- add_criterion(
  fit_fe_1
  , "kfold" 
  , K = 10
  , cores = 4
  )
Fitting model 1 out of 10
Fitting model 2 out of 10
Fitting model 3 out of 10
Fitting model 4 out of 10
Fitting model 5 out of 10
Fitting model 6 out of 10
Fitting model 7 out of 10
Fitting model 8 out of 10
Fitting model 9 out of 10
Fitting model 10 out of 10
fit_re_1 <- add_criterion(
  fit_re_1
  , "kfold"
  , K = 10
  , cores = 4
  )
Fitting model 1 out of 10
Start sampling
Fitting model 2 out of 10
Start sampling
Fitting model 3 out of 10
Start sampling
Fitting model 4 out of 10
Start sampling
Fitting model 5 out of 10
Start sampling
Fitting model 6 out of 10
Start sampling
Fitting model 7 out of 10
Start sampling
Fitting model 8 out of 10
Start sampling
Fitting model 9 out of 10
Start sampling
Fitting model 10 out of 10
Start sampling
comp_1 <- loo_compare(fit_fe_1, fit_re_1, criterion = "kfold")
comp_1
         elpd_diff se_diff
fit_re_1     0.0       0.0
fit_fe_1 -2842.7    2820.7

Although model comparisons showed that the model with random effects fitted better, the difference was not significant (Δ ELPD = -2842.75, 95% CI [-8371.271, 2685.78]. Hence, for reasons of parsimony the model with fixed effects is preferred.

Hurdle

Let’s now estimate a fixed effects model with hurdles.

fit_hrdl_1 <- 
  hush(
    brm(
      bf(
        n_Words ~ 
          1 + persistence_dev * anonymity_dev + age + female + pol_stance +
          (1 | topic) + 
          (1 | topic:group),
        zi ~ 
          1 + persistence_dev * anonymity_dev + age + female + pol_stance +
          (1 | topic) + 
          (1 | topic:group)
      )
    , data = d
    , chains = 4
    , cores = 4
    , iter = 6000
    , warmup = 2000
    , family = zero_inflated_poisson()
    , control = list(
      adapt_delta = .95
      , max_treedepth = 15
      )
    )
  )
Warning: There were 518 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: Examine the pairs() plot to diagnose sampling problems

Agian, some warnings.

Let’s inspect model.

plot(fit_hrdl_1, ask = FALSE)

Trace-plots look alright.

summary(fit_hrdl_1)
Warning: There were 518 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
 Family: zero_inflated_poisson 
  Links: mu = log; zi = logit 
Formula: n_Words ~ 1 + persistence_dev * anonymity_dev + age + female + pol_stance + (1 | topic) + (1 | topic:group) 
         zi ~ 1 + persistence_dev * anonymity_dev + age + female + pol_stance + (1 | topic) + (1 | topic:group)
   Data: d (Number of observations: 960) 
  Draws: 4 chains, each with iter = 6000; warmup = 2000; thin = 1;
         total post-warmup draws = 16000

Multilevel Hyperparameters:
~topic (Number of levels: 3) 
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)        0.14      0.16     0.00     0.60 1.00     4317     6264
sd(zi_Intercept)     0.35      0.50     0.01     1.76 1.00     4375     5635

~topic:group (Number of levels: 48) 
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)        0.37      0.04     0.30     0.46 1.00     4196     6563
sd(zi_Intercept)     0.24      0.14     0.02     0.53 1.00     4472     6744

Regression Coefficients:
                                   Estimate Est.Error l-95% CI u-95% CI Rhat
Intercept                              5.78      0.11     5.56     6.01 1.00
zi_Intercept                          -1.87      0.51    -2.79    -0.84 1.00
persistence_dev1                      -0.04      0.05    -0.14     0.07 1.00
anonymity_dev1                        -0.05      0.05    -0.16     0.05 1.00
age                                    0.01      0.00     0.01     0.01 1.00
femaleTRUE                            -0.12      0.00    -0.13    -0.12 1.00
pol_stance                            -0.02      0.00    -0.02    -0.02 1.00
persistence_dev1:anonymity_dev1        0.04      0.05    -0.06     0.15 1.00
zi_persistence_dev1                    0.05      0.09    -0.13     0.23 1.00
zi_anonymity_dev1                      0.03      0.09    -0.14     0.21 1.00
zi_age                                 0.02      0.01     0.00     0.03 1.00
zi_femaleTRUE                          0.14      0.18    -0.21     0.49 1.00
zi_pol_stance                         -0.05      0.04    -0.13     0.03 1.00
zi_persistence_dev1:anonymity_dev1     0.01      0.09    -0.17     0.19 1.00
                                   Bulk_ESS Tail_ESS
Intercept                              5212     5627
zi_Intercept                          10949     6527
persistence_dev1                       2858     5138
anonymity_dev1                         3055     5142
age                                   16275    12055
femaleTRUE                            29563    10725
pol_stance                            17068     9763
persistence_dev1:anonymity_dev1        2754     5019
zi_persistence_dev1                   20542    11569
zi_anonymity_dev1                     19022    11360
zi_age                                23044    11719
zi_femaleTRUE                         24228    10957
zi_pol_stance                         25684    11393
zi_persistence_dev1:anonymity_dev1    17845    10858

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Same results, no main effects, slightly larger but still nonsignificant interaction effect.

Exploratory Analyses

Frequentist

Look at results from a frequentist perspective.

Fixed effects

Estimate nested model.

fit_fe_1_frq <- 
  lmer(
    n_Words ~ 
      1 + 
      (1 | topic/group) + 
      persistence_dev * anonymity_dev + age + female + pol_stance
    , data = d
    )

summary(fit_fe_1_frq)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: n_Words ~ 1 + (1 | topic/group) + persistence_dev * anonymity_dev +  
    age + female + pol_stance
   Data: d

REML criterion at convergence: 15030

Scaled residuals: 
   Min     1Q Median     3Q    Max 
-1.208 -0.507 -0.235  0.221 13.409 

Random effects:
 Groups      Name        Variance Std.Dev.
 group:topic (Intercept)  13591   117     
 topic       (Intercept)      0     0     
 Residual                381808   618     
Number of obs: 960, groups:  group:topic, 48; topic, 3

Fixed effects:
                                Estimate Std. Error      df t value Pr(>|t|)   
(Intercept)                      277.629    104.780 910.923    2.65   0.0082 **
persistence_dev1                 -28.500     26.099  43.505   -1.09   0.2808   
anonymity_dev1                   -24.313     26.260  44.563   -0.93   0.3595   
age                                3.663      1.760 950.062    2.08   0.0377 * 
femaleTRUE                       -60.301     43.364 946.729   -1.39   0.1647   
pol_stance                         0.468     10.152 949.443    0.05   0.9632   
persistence_dev1:anonymity_dev1    6.706     26.168  43.950    0.26   0.7990   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) prss_1 anny_1 age    fmTRUE pl_stn
prsstnc_dv1  0.018                                   
annymty_dv1  0.063  0.000                            
age         -0.758 -0.011 -0.058                     
femaleTRUE  -0.468 -0.016  0.049  0.251              
pol_stance  -0.546 -0.011 -0.067 -0.002  0.063       
prsstn_1:_1  0.023  0.001 -0.002 -0.053 -0.039  0.045
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

Quite weird that topic doesn’t get any variance at all. Perhaps due to small cluster size? With Bayesian estimation, it worked alright. Also, again no significant effects.

Estimate without nesting.

fit_fe_2_frq <- 
  lmer(
    n_Words ~ 
      1 + 
      (1 | group) +
      persistence_dev * anonymity_dev + age + female + pol_stance + topic
    , data = d
    )

summary(fit_fe_2_frq)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: n_Words ~ 1 + (1 | group) + persistence_dev * anonymity_dev +  
    age + female + pol_stance + topic
   Data: d

REML criterion at convergence: 15009

Scaled residuals: 
   Min     1Q Median     3Q    Max 
-1.200 -0.508 -0.233  0.219 13.366 

Random effects:
 Groups   Name        Variance Std.Dev.
 group    (Intercept)  14508   120     
 Residual             381804   618     
Number of obs: 960, groups:  group, 48

Fixed effects:
                                Estimate Std. Error      df t value Pr(>|t|)  
(Intercept)                      267.434    111.482 618.730    2.40    0.017 *
persistence_dev1                 -28.506     26.462  41.531   -1.08    0.288  
anonymity_dev1                   -24.265     26.622  42.515   -0.91    0.367  
age                                3.680      1.762 948.098    2.09    0.037 *
femaleTRUE                       -59.385     43.404 944.642   -1.37    0.172  
pol_stance                         0.324     10.160 947.596    0.03    0.975  
topicgender                      -13.401     64.870  41.660   -0.21    0.837  
topicmigration                    42.524     64.842  41.590    0.66    0.516  
persistence_dev1:anonymity_dev1    6.660     26.531  41.946    0.25    0.803  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) prss_1 anny_1 age    fmTRUE pl_stn tpcgnd tpcmgr
prsstnc_dv1  0.016                                                 
annymty_dv1  0.059  0.000                                          
age         -0.722 -0.011 -0.057                                   
femaleTRUE  -0.435 -0.016  0.048  0.250                            
pol_stance  -0.505 -0.010 -0.066 -0.003  0.064                     
topicgender -0.285  0.000 -0.002  0.020 -0.028 -0.024              
topicmigrtn -0.300  0.000 -0.001  0.028  0.000 -0.018  0.500       
prsstn_1:_1  0.022  0.001 -0.002 -0.052 -0.039  0.044 -0.001 -0.002

Also shows no significant effects.

For curiosity, estimate also without hierarchical structure.

fit_fe_3_frq <- 
  lm(
    n_Words ~ 
      1 + 
      persistence_dev * anonymity_dev + topic + age + female + pol_stance
    , data = d
    )

summary(fit_fe_3_frq)

Call:
lm(formula = n_Words ~ 1 + persistence_dev * anonymity_dev + 
    topic + age + female + pol_stance, data = d)

Residuals:
   Min     1Q Median     3Q    Max 
  -604   -327   -150    125   8448 

Coefficients:
                                Estimate Std. Error t value Pr(>|t|)  
(Intercept)                       239.86     108.03    2.22    0.027 *
persistence_dev1                  -28.61      20.28   -1.41    0.159  
anonymity_dev1                    -25.20      20.49   -1.23    0.219  
topicgender                       -13.59      49.74   -0.27    0.785  
topicmigration                     42.39      49.71    0.85    0.394  
age                                 3.91       1.77    2.21    0.027 *
femaleTRUE                        -60.28      43.67   -1.38    0.168  
pol_stance                          3.77      10.21    0.37    0.712  
persistence_dev1:anonymity_dev1     6.94      20.37    0.34    0.733  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 628 on 951 degrees of freedom
Multiple R-squared:  0.0139,    Adjusted R-squared:  0.00557 
F-statistic: 1.67 on 8 and 951 DF,  p-value: 0.101

Also here, no significant effects.

Gender

As preregistered, let’s see if effects differ across genders.

fit_fe_gen <- 
  hush(
    brm(
      n_Words ~ 
        1 + persistence_dev * anonymity_dev * gender + age + pol_stance +
        (1 | topic/group)
      , data = d
      , chains = 4
      , cores = 4
      , iter = 8000
      , warmup = 2000
      , family = zero_inflated_poisson()
      , control = list(
        adapt_delta = .95
        , max_treedepth = 12
        )
      )
  )
Warning: There were 815 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: There were 1125 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 12. See
https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
Warning: Examine the pairs() plot to diagnose sampling problems

Again, some warnings.

Let’s inspect model.

plot(fit_fe_gen, ask = FALSE)

Traceplots look alright.

Let’s look at results.

summary(fit_fe_gen)
Warning: There were 815 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
 Family: zero_inflated_poisson 
  Links: mu = log; zi = identity 
Formula: n_Words ~ 1 + persistence_dev * anonymity_dev * gender + age + pol_stance + (1 | topic/group) 
   Data: d (Number of observations: 960) 
  Draws: 4 chains, each with iter = 8000; warmup = 2000; thin = 1;
         total post-warmup draws = 24000

Multilevel Hyperparameters:
~topic (Number of levels: 3) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.16      0.21     0.00     0.74 1.00     2230     1189

~topic:group (Number of levels: 48) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.36      0.04     0.29     0.45 1.00     5390     8676

Regression Coefficients:
                                           Estimate Est.Error l-95% CI u-95% CI
Intercept                                      5.66      0.12     5.41     5.91
persistence_dev1                               0.01      0.05    -0.09     0.12
anonymity_dev1                                -0.10      0.05    -0.21    -0.00
gendermale                                     0.11      0.00     0.11     0.12
age                                            0.01      0.00     0.01     0.01
pol_stance                                    -0.02      0.00    -0.02    -0.01
persistence_dev1:anonymity_dev1                0.04      0.05    -0.07     0.14
persistence_dev1:gendermale                   -0.12      0.00    -0.12    -0.11
anonymity_dev1:gendermale                      0.13      0.00     0.12     0.13
persistence_dev1:anonymity_dev1:gendermale     0.04      0.00     0.04     0.05
                                           Rhat Bulk_ESS Tail_ESS
Intercept                                  1.00     4483     1383
persistence_dev1                           1.00     6205     8840
anonymity_dev1                             1.00     6211     7914
gendermale                                 1.00    20263    13441
age                                        1.00    22439    18918
pol_stance                                 1.00    21345    13579
persistence_dev1:anonymity_dev1            1.00     5838     6616
persistence_dev1:gendermale                1.00    14006    12104
anonymity_dev1:gendermale                  1.00    16834    13056
persistence_dev1:anonymity_dev1:gendermale 1.00    19399    13871

Further Distributional Parameters:
   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
zi     0.21      0.01     0.18     0.23 1.00    16050    12599

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Indeed, several gender effects.

  • For females, the effect of persistence is larger, that is more positive.
  • For females, the effect of anonymity is smaller, that is more negative.
  • For females, the interaction effect is also a bit smaller, that is more negative.

Let’s visualize results.

p_gen <- plot(
  conditional_effects(
    fit_fe_gen
    ), 
  ask = FALSE,
  plot = FALSE
  )

p_gen_pers <- 
  p_gen[["persistence_dev:gender"]] +
  xlab("Persistence") +
  ylab("Words") +
  # scale_y_continuous(
  #   limits = c(4, 15),
  #   breaks = c(5, 7.5, 10, 12.5, 15)
  # ) +
  scale_x_discrete(
    limits = rev
  ) +
  guides(
    fill = "none"
    , color = "none"
    )

p_gen_anon <- 
  p_gen[["anonymity_dev:gender"]] +
  xlab("Anonymity") +
  ylab("Words") +
  # scale_y_continuous(
  #   limits = c(3.5, 15),
  #   breaks = c(5, 7.5, 10, 12.5, 15)
  # ) +
  theme(
    axis.title.y = element_blank()
    ) +
  guides(
    fill = "none"
    ) + 
  scale_x_discrete(
    limits = rev
  ) +
  scale_color_discrete(
    name = "Gender"
    )

plot_gen <- cowplot::plot_grid(
  p_gen_pers, p_gen_anon, 
  labels = c('A', 'B'), 
  nrow = 1,
  rel_widths = c(4, 5)
  )

plot_gen

ggsave("figures/results_gen.png", plot_gen, width = 8, height = 4)

Benefits

Let’s see if benefits differ across experimental groups.

We first look at the experimental group’s descriptives

d |> 
  group_by(persistence) |> 
  summarize(benefits_m = mean(benefits, na.rm = TRUE)) |> 
  as.data.frame() |> 
  kable()
persistence benefits_m
high 3.12
low 3.23

Looking at persistence, we see people with lower persistence reporting slightly higher benefits.

d |> 
  group_by(anonymity) |> 
  summarize(benefits_m = mean(benefits, na.rm = TRUE)) |> 
  as.data.frame() |> 
  kable()
anonymity benefits_m
high 3.15
low 3.20

Looking at anonymity, we see people with low anonymity reporting marginally higher benefits.

d |> 
  group_by(persistence, anonymity) |> 
  summarize(benefits_m = mean(benefits, na.rm = T)) |> 
  as.data.frame() |> 
  kable()
`summarise()` has grouped output by 'persistence'. You can override using the
`.groups` argument.
persistence anonymity benefits_m
high high 3.07
high low 3.18
low high 3.22
low low 3.23

Looking at both groups combined, we see that low anonymity and low persistence yielded highest benefits.

Let’s look if effects are significant.

fit_fe_ben_1 <- 
  hush(
    brm(
      benefits ~ 
        1 + persistence_dev * anonymity_dev  + age + female + pol_stance +
        (1 | topic/group)
      , data = d
      , chains = 4
      , cores = 4
      , iter = 6000
      , warmup = 2000
      , control = list(
        adapt_delta = .95
        , max_treedepth = 12
        )
      )
  )
Warning: Rows containing NAs were excluded from the model.
Warning: There were 119 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: Examine the pairs() plot to diagnose sampling problems

Let’s inspect model.

plot(fit_fe_ben_1, ask = FALSE)

Traceplots look alright.

Let’s look at results.

summary(fit_fe_ben_1)
Warning: There were 119 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: benefits ~ 1 + persistence_dev * anonymity_dev + age + female + pol_stance + (1 | topic/group) 
   Data: d (Number of observations: 705) 
  Draws: 4 chains, each with iter = 6000; warmup = 2000; thin = 1;
         total post-warmup draws = 16000

Multilevel Hyperparameters:
~topic (Number of levels: 3) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.13      0.18     0.00     0.67 1.00     2834     1761

~topic:group (Number of levels: 48) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.08      0.05     0.00     0.18 1.00     3508     4024

Regression Coefficients:
                                Estimate Est.Error l-95% CI u-95% CI Rhat
Intercept                           3.22      0.17     2.87     3.55 1.00
persistence_dev1                   -0.05      0.03    -0.11     0.01 1.00
anonymity_dev1                     -0.03      0.03    -0.09     0.03 1.00
age                                -0.00      0.00    -0.01     0.00 1.00
femaleTRUE                         -0.07      0.06    -0.19     0.04 1.00
pol_stance                          0.02      0.01    -0.01     0.04 1.00
persistence_dev1:anonymity_dev1    -0.02      0.03    -0.08     0.04 1.00
                                Bulk_ESS Tail_ESS
Intercept                           5256     3238
persistence_dev1                   13714     8987
anonymity_dev1                     15562     9606
age                                22366    11104
femaleTRUE                         13893     9395
pol_stance                         16363    11427
persistence_dev1:anonymity_dev1    11999     8868

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.74      0.02     0.70     0.78 1.00    15218    10579

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

No significant effects. But note that effect of persistence on perceived benefits only marginally not significant.

Costs

Let’s see if perceived differed across experimental groups.

We first look at the experimental group’s descriptives

d |> 
  group_by(persistence) |> 
  summarize(costs = mean(costs, na.rm = TRUE)) |> 
  as.data.frame() |> 
  kable()
persistence costs
high 1.99
low 1.99

Looking at persistence, we see both groups report equal costs.

d |> 
  group_by(anonymity) |> 
  summarize(costs = mean(costs, na.rm = TRUE)) |> 
  as.data.frame() |> 
  kable()
anonymity costs
high 1.89
low 2.09

Looking at anonymity, we see people with low anonymity report slightly higher costs.

d |> 
  group_by(persistence, anonymity) |> 
  summarize(costs = mean(costs, na.rm = TRUE)) |> 
  as.data.frame() |> 
  kable()
`summarise()` has grouped output by 'persistence'. You can override using the
`.groups` argument.
persistence anonymity costs
high high 1.90
high low 2.07
low high 1.87
low low 2.11

Looking at both groups combined, we see that highest costs were reported by group with low anonymity and low persistence.

Let’s look if effects are significant.

fit_fe_costs_1 <- 
  hush(
    brm(
      costs ~ 
        1 + persistence_dev * anonymity_dev + age + female + pol_stance +
        (1 | topic/group)
      , data = d
      , chains = 4
      , cores = 4
      , iter = 8000
      , warmup = 2000
      , control = list(
        adapt_delta = .95
        , max_treedepth = 12
        )
      )
  )
Warning: Rows containing NAs were excluded from the model.
Warning: There were 119 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: Examine the pairs() plot to diagnose sampling problems

Let’s inspect model.

plot(fit_fe_costs_1, ask = FALSE)

Traceplots look alright.

Let’s look at results.

summary(fit_fe_costs_1)
Warning: There were 119 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: costs ~ 1 + persistence_dev * anonymity_dev + age + female + pol_stance + (1 | topic/group) 
   Data: d (Number of observations: 705) 
  Draws: 4 chains, each with iter = 8000; warmup = 2000; thin = 1;
         total post-warmup draws = 24000

Multilevel Hyperparameters:
~topic (Number of levels: 3) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.14      0.22     0.00     0.79 1.00     4391     3051

~topic:group (Number of levels: 48) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.07      0.05     0.00     0.17 1.00     7582     9602

Regression Coefficients:
                                Estimate Est.Error l-95% CI u-95% CI Rhat
Intercept                           2.48      0.20     2.09     2.87 1.00
persistence_dev1                    0.00      0.03    -0.06     0.07 1.00
anonymity_dev1                     -0.08      0.03    -0.15    -0.02 1.00
age                                -0.01      0.00    -0.02    -0.01 1.00
femaleTRUE                          0.01      0.07    -0.12     0.14 1.00
pol_stance                         -0.00      0.02    -0.04     0.03 1.00
persistence_dev1:anonymity_dev1     0.02      0.03    -0.05     0.09 1.00
                                Bulk_ESS Tail_ESS
Intercept                           7138     4975
persistence_dev1                   27961    11041
anonymity_dev1                     31334    16273
age                                40467    19477
femaleTRUE                         36893    17982
pol_stance                         42598    16990
persistence_dev1:anonymity_dev1    31312    16982

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.84      0.02     0.80     0.89 1.00    35120    17084

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

We find that anonymity does reduce costs.

Mediation

Let’s see if perceived benefits and costs were associated with increased words communicated.

fit_fe_med <- 
  hush(
    brm(
      n_Words ~ 
        1 + persistence_dev * anonymity_dev + benefits + costs  + age + female + pol_stance + 
        (1 | topic/group)
      , data = d
      , chains = 4
      , cores = 4
      , iter = 6000
      , warmup = 2000
      , family = zero_inflated_poisson()
      , control = list(
        adapt_delta = .95
        , max_treedepth = 12
        )
      )
  )
Warning: Rows containing NAs were excluded from the model.
Warning: There were 396 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: There were 1733 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 12. See
https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
Warning: Examine the pairs() plot to diagnose sampling problems

Let’s look at results.

summary(fit_fe_med)
Warning: There were 396 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
 Family: zero_inflated_poisson 
  Links: mu = log; zi = identity 
Formula: n_Words ~ 1 + persistence_dev * anonymity_dev + benefits + costs + age + female + pol_stance + (1 | topic/group) 
   Data: d (Number of observations: 705) 
  Draws: 4 chains, each with iter = 6000; warmup = 2000; thin = 1;
         total post-warmup draws = 16000

Multilevel Hyperparameters:
~topic (Number of levels: 3) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.20      0.25     0.01     0.92 1.00     2579     4250

~topic:group (Number of levels: 48) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.39      0.04     0.32     0.48 1.00     3713     5225

Regression Coefficients:
                                Estimate Est.Error l-95% CI u-95% CI Rhat
Intercept                           5.99      0.15     5.67     6.31 1.00
persistence_dev1                   -0.04      0.06    -0.15     0.07 1.00
anonymity_dev1                     -0.04      0.06    -0.15     0.08 1.00
benefits                            0.06      0.00     0.06     0.07 1.00
costs                              -0.14      0.00    -0.14    -0.14 1.00
age                                 0.01      0.00     0.01     0.01 1.00
femaleTRUE                         -0.09      0.00    -0.10    -0.09 1.00
pol_stance                         -0.01      0.00    -0.02    -0.01 1.00
persistence_dev1:anonymity_dev1     0.05      0.06    -0.06     0.16 1.00
                                Bulk_ESS Tail_ESS
Intercept                           5407     4531
persistence_dev1                    4451     5678
anonymity_dev1                      4462     5930
benefits                           14066     8723
costs                              13336     8776
age                                17268    13032
femaleTRUE                         13331     8887
pol_stance                         14543     9796
persistence_dev1:anonymity_dev1     4646     5518

Further Distributional Parameters:
   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
zi     0.08      0.01     0.06     0.10 1.00    13751     9148

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

We find that increased perceived costs are associated with decreased words communicated. Increased benefits are associated with increased words communicated. Let’s check if overall effect is significant.

anon_costs_a_b <- fixef(fit_fe_costs_1)["anonymity_dev1", "Estimate"]
anon_costs_a_se <- fixef(fit_fe_costs_1)["anonymity_dev1", "Est.Error"]
anon_costs_a_dis <- rnorm(10000, anon_costs_a_b, anon_costs_a_se)

anon_costs_b_b <- fixef(fit_fe_med)["benefits", "Estimate"]
anon_costs_b_se <- fixef(fit_fe_med)["benefits", "Est.Error"]
anon_costs_b_dis <- rnorm(10000, anon_costs_b_b, anon_costs_b_se)

anon_costs_ab_dis <- anon_costs_a_dis * anon_costs_b_dis
anon_costs_ab_m <- median(anon_costs_ab_dis)
anon_costs_ab_ll <- quantile(anon_costs_ab_dis, .025)
anon_costs_ab_ul <- quantile(anon_costs_ab_dis, .975)

The effect is significant (b = -0.01, 95% MC CI [-0.01, 0]).

Save

save.image("data/image_words.RData")